feat(bootstrap): save connected peers as backup bootstrap peers #8856
Conversation
@schomatis: let's put this in the repo datastore for now, so that we are free to change/remove it later without breaking folks.
@guseggert Moved to datastore. Ready for review. I will need help with tests if they're required to land this.
@guseggert: this is a mental note that I'd like to flag this one for code review on Friday so we get it over the line.
@BigLep Any update on this? I've read through the issue and it seems the end goal is to move this functionality into go-libp2p (libp2p/go-libp2p-kad-dht#254); is this still the plan?
This feature would be great to have. It enables:
Thanks for the flags. This has gotten lost among all the other priorities that have emerged so far in 2023. I have it on the list for 0.20. A key thing we're missing is clarity on what remains to land this.
I just rebased this and will try to take a look at it this week or next. This would be a great addition. @Jorropo it would be great if you could take a look too, since you've reviewed it before.
I went over the code, cleaned up some comments, and pushed a fix. Also updated the main PR description with how to test manually. The code looks good to me and works well. What I think is left to review:
- Decide if the new variable names and defaults are fine. I have no problems with them.
- I would like to test this, but I don't see how we'd do it.
cc @ipfs/kubo-maintainers
Some nits below, but it does not have to be perfect; anything here is better than nothing.
The important part is to have regression tests that ensure the backup bootstrappers are used.
@hacdias I think testing would live in https://github.com/ipfs/kubo/tree/master/test/cli and be similar to swarm_test.go or dht_legacy_test.go:
- configure the node with `ipfs config --json Bootstrap []` and `Discovery.MDNS.Enabled: false` (we want it to boot and have no peers)
- set the bootstrap backup interval to something small, like 250ms (could be the config proposed below inline, or an env override via os.LookupEnv)
- start the daemon and confirm it has no peers (expected)
- start a second peer (with default config and `Discovery.MDNS.Enabled: false`; we want it to be a member of the public swarm, but not connect to the first one) and confirm it has some peers
- manually connect the first one to the second one
- confirm the first one learned about other peers (`ipfs swarm peers` longer than 1)
- shut down both daemons, then start the first one again
- it should be connected to something thanks to the backed-up peers (`ipfs swarm peers` longer than 1)

(to make this test work offline we could introduce a third node; a rough sketch of such a test follows below)
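For illustration, here is a minimal sketch of what such a regression test could look like in test/cli. It assumes kubo's Go test harness and config helpers (`harness.NewT`, `UpdateConfig`, `config.NewOptionalDuration`, and the `Internal.BackupBootstrapInterval` knob are assumptions based on this thread, not verbatim code from the PR):

```go
package cli

import (
	"testing"
	"time"

	"github.com/ipfs/kubo/config"
	"github.com/ipfs/kubo/test/cli/harness"
	"github.com/stretchr/testify/assert"
)

func TestBackupBootstrapPeers(t *testing.T) {
	// Three isolated nodes: empty Bootstrap list, MDNS off, and a very
	// short backup interval so the backup logic fires during the test.
	nodes := harness.NewT(t).NewNodes(3).Init()
	nodes.ForEachPar(func(n *harness.Node) {
		n.UpdateConfig(func(cfg *config.Config) {
			cfg.Bootstrap = []string{}
			cfg.Discovery.MDNS.Enabled = false
			cfg.Internal.BackupBootstrapInterval = config.NewOptionalDuration(250 * time.Millisecond)
		})
	})

	// With no bootstrap list and no MDNS, nodes only gain peers when we
	// connect them manually: 0 <-> 1 and 0 <-> 2.
	nodes.StartDaemons()
	nodes[0].Connect(nodes[1])
	nodes[0].Connect(nodes[2])

	// Wait longer than the backup interval so each node persists the
	// peers it is currently connected to, then stop everything.
	time.Sleep(time.Second)
	nodes.StopDaemons()

	// On restart there is still no bootstrap list, so any connections the
	// nodes regain must come from the backed-up peers.
	nodes.StartDaemons()
	time.Sleep(time.Second)
	for _, n := range nodes {
		assert.NotEmpty(t, n.Peers())
	}
}
```

The short interval matters: the save loop must fire at least once while the nodes are connected, otherwise nothing is persisted before shutdown.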
@lidel I added a test that runs completely separately from the rest of the network. I made a slight change compared to what you suggested: we have nodes 0, 1, and 2, and they all share the same config (no bootstrap nodes, MDNS disabled, 250ms backup bootstrap interval). The following happens:
I'm approving this. I will wait for one more approval since I wrote the test 😃
MinPeerThreshold:        4,                      // target number of bootstrap connections
Period:                  30 * time.Second,       // how often the bootstrap routine runs
ConnectionTimeout:       (30 * time.Second) / 3, // Period / 3
BackupBootstrapInterval: 1 * time.Hour,          // how often connected peers are saved as backups
MaxBackupBootstrapSize:  20,                     // max number of backup peers to persist
ℹ️ We can expose these later under `Internal.*` if needed (in future PRs).
For now, only `Internal.BackupBootstrapInterval` is exposed, for use in the test/cli/backup_bootstrap_test.go test. Not documenting it, as we may change the name/location later.
I think this one baked for long enough -- it does the basic thing; we can refine in follow-up PRs, but it's already way better than nothing.
Thank you all for pushing on this. Added a changelog entry; merging when CI passes.
@lidel: Awesome! I wanted to confirm that the default bootstrap peers will still stay in the list, but that now the list is augmented with additional peers based on past discovery. The reason I'm asking is because of ProbeLab's measurements that try to ascertain the size of the DHT network based on the number of peer IDs that connect to it.
@BigLep the default bootstrap peers stay. We add additional backup peers based on our connections. Those will only be dialed if the default bootstrap list does not work.
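To make the mechanics concrete, here is a minimal sketch of how saving and reloading backup peers through the repo datastore could look. This is illustrative only: the datastore key, function names, and JSON serialization below are assumptions, not the PR's actual code.

```go
package bootstrap

import (
	"context"
	"encoding/json"

	"github.com/ipfs/go-datastore"
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/multiformats/go-multiaddr"
)

// backupKey is a hypothetical datastore key for the saved peers; the PR may
// store them under a different key in the repo datastore.
var backupKey = datastore.NewKey("/local/backup-bootstrap-peers")

// savePeers stores the addresses of up to maxSize currently connected peers
// so they can serve as fallback bootstrappers on the next start.
func savePeers(ctx context.Context, ds datastore.Datastore, h host.Host, maxSize int) error {
	infos := make([]peer.AddrInfo, 0, maxSize)
	for _, c := range h.Network().Conns() {
		if len(infos) == maxSize {
			break
		}
		infos = append(infos, peer.AddrInfo{
			ID:    c.RemotePeer(),
			Addrs: []multiaddr.Multiaddr{c.RemoteMultiaddr()},
		})
	}
	b, err := json.Marshal(infos) // peer.AddrInfo supports JSON (de)serialization
	if err != nil {
		return err
	}
	return ds.Put(ctx, backupKey, b)
}

// loadPeers returns the saved backup peers. The bootstrap logic would dial
// these only when the configured Bootstrap list leaves us below
// MinPeerThreshold connections.
func loadPeers(ctx context.Context, ds datastore.Datastore) ([]peer.AddrInfo, error) {
	b, err := ds.Get(ctx, backupKey)
	if err != nil {
		return nil, err
	}
	var infos []peer.AddrInfo
	err = json.Unmarshal(b, &infos)
	return infos, err
}
```

Keeping this in the repo datastore rather than the config file means it stays an implementation detail that can be renamed or dropped later without breaking users' configs, matching the reasoning earlier in the thread.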
This PR aims to close #3926.

Every hour (`DefaultBootstrapConfig.BackupBootstrapInterval`, can be overridden via `Internal.BackupBootstrapInterval`) we save a number of peers we're connected to in Kubo's datastore. If we can't connect to our "official" bootstrap peers defined in the `Bootstrap` config (i.e. we don't hit the target in `DefaultBootstrapConfig.MinPeerThreshold` -- again hardcoded; why is this not in the config file?), we resort to connecting to some of the peers from the above list.

How To Manually Test
After running for a few seconds, we'll see the log:
Stop the daemon.
Back up and remove the official bootstrap peers, CAREFUL:
At this point, the version in master should not be able to connect to the network:
With this PR:
Restore the bootstrap list with `ipfs config --json Bootstrap "$(cat bootstrap-peers.back)"`.